Create the networks

Get the model function, and also load the real life data.

library(tidyverse)
library(gifski)
library(ggraph)
source("modelFunction_rewiring.R")
load("data/toUse.Rda")

Make 5-day graphs from the real data:

realGraphs <- vultureUtils::makeGraphs(edges = toUse, interval = "5 days", 
                       dateTimeStart = "2020-01-01 00:00:00",
                       dateTimeEnd = "2021-09-01 11:59:00",
                       weighted = FALSE, allVertices = TRUE)$graphs
nIndivs <- length(igraph::V(realGraphs[[1]]))

We have length(realGraphs) = 73 timestep graphs.

Run the model with 40 individuals. Setting the burn.in to length(realGraphs) = 73 because for the purposes of testing the baseline dynamics, we don’t actually care about the removal aspect.

modelGraphs <- runModel(N = nIndivs, burn.in = length(realGraphs), doRemoval = FALSE) %>%
  lapply(., function(x){
    igraph::graph_from_adjacency_matrix(x, mode = "undirected")
  })


# Double check that we have the same dimensions throughout
unique(unlist(lapply(modelGraphs, length)))
## [1] 40
unique(unlist(lapply(realGraphs, length)))
## [1] 40
# Good, both are the same.

Compare results from the real and modeled networks.

1) Degree Distributions

Obtain degree information:

fn <- function(x){
  igraph::degree(x) %>%
    as.data.frame()
}
degrees_real <- lapply(realGraphs, fn) %>% 
  setNames(., NULL) %>%
  data.table::rbindlist(idcol = "timestep") %>% 
  mutate(type = "real")

degrees_model <- lapply(modelGraphs, fn) %>%
  setNames(., NULL) %>%
  data.table::rbindlist(idcol = "timestep") %>%
  mutate(type = "model")

# create a single data frame for plotting
degrees <- bind_rows(degrees_real, degrees_model) %>%
  rename("degree" = ".")

Make plots:

degrees %>%
  mutate(timestep = as.factor(timestep)) %>%
  ggplot(aes(x = degree, col = timestep))+
  geom_density()+
  facet_wrap(~type)+
  theme_minimal()+
  theme(legend.position = "none")

These look worryingly different. Let’s see what happens if we remove the 0-degree individuals from the real distributions. Although we’re still going to have to figure out why there are so many more isolated nodes in the real data than in the model data, and how to replicate that in the model.

Removing individuals with degree 0:

degrees %>%
  filter(degree > 0) %>%
  mutate(timestep = as.factor(timestep)) %>%
  ggplot(aes(x = degree, col = timestep))+
  geom_density()+
  facet_wrap(~type)+
  theme_minimal()+
  theme(legend.position = "none")

Okay, these are more similar now, and the model isn’t wildly off. But there’s still clearly some behavior going on here that we need to work harder on replicating in the model.

Mean and variation in degree distribution over time:

degrees %>%
  ggplot(aes(x = timestep, y = degree))+
  geom_point(alpha = 0.2) +
  geom_smooth()+
  facet_wrap(~type)+
  theme_minimal()+
  theme(legend.position = "none")
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

Ah, once again, we have a lot more individuals with degree 0 in the real data than in the model. If we remove the individuals with degree 0, we get something a tad more reasonable:

degrees %>%
  filter(degree > 0) %>%
  ggplot(aes(x = timestep, y = degree))+
  geom_point(alpha = 0.2) +
  geom_smooth()+
  facet_wrap(~type)+
  theme_minimal()+
  theme(legend.position = "none")
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

But this reinforces the idea that we have to do a better job modeling the disconnected individuals in the model.

Ooh, what happens if I look at individuals’ participation in feeding events–do they only feed every couple days? Maybe I could add a parameter to automatically exclude them from the network if they’ve participated for the past X days. Look into this.

2) Network density

Calculate densities:

density_real <- lapply(realGraphs, igraph::edge_density) %>% 
  setNames(., NULL) %>%
  unlist() %>%
  as.data.frame() %>%
  mutate(type = "real",
         timestep = 1:n())

density_model <- lapply(modelGraphs, igraph::edge_density) %>%
  setNames(., NULL) %>%
  unlist() %>%
  as.data.frame() %>%
  mutate(type = "model",
         timestep = 1:n())

# create a single data frame for plotting
densities <- bind_rows(density_real, density_model) %>%
  rename("density" = ".")

Plot densities:

densities %>%
  ggplot(aes(x = timestep, y = density, col = type))+
  geom_smooth()+
  theme_minimal()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Once again, this seems to confirm that the modeled networks are much more dense than the real networks.

How is it possible that we arrived at this when the probability of generating a new edge was drawn from real distributions? What am I missing?

3) Clustering

Following Farine 2021, going to measure transitivity in these graphs.

trans_real <- lapply(realGraphs, igraph::transitivity) %>% 
  setNames(., NULL) %>%
  unlist() %>%
  as.data.frame() %>%
  mutate(type = "real",
         timestep = 1:n())

trans_model <- lapply(modelGraphs, igraph::transitivity) %>%
  setNames(., NULL) %>%
  unlist() %>%
  as.data.frame() %>%
  mutate(type = "model",
         timestep = 1:n())

# create a single data frame for plotting
transitivities <- bind_rows(trans_real, trans_model) %>%
  rename("transitivity" = ".")

Plot transitivities:

transitivities %>%
  ggplot(aes(x = timestep, y = transitivity, col = type))+
  geom_smooth()+
  theme_minimal()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: Removed 4 rows containing non-finite values (stat_smooth).

Oh, this is fascinating. Okay so the modeled graphs are denser than the real ones but have lower transitivity. Transitivity is the “probability that the adjacent vertices of a vertex are connected. This is sometimes also called the clustering coefficient.” This is highly relevant to the question of rewiring, and it’s also very relevant to understanding the biology of the species.

So at this point, we have a working hypothesis that the reason for these different structures is that individuals feed infrequently and in groups. (I’m not sure if that hypothesis explains the transitivity thing, though. Maybe the move is to have them prefer certain other individuals/friends?)

If this hypothesis is correct, then the next questions to ask are: 1) how often does each vulture feed, in the model vs. in real life? 2) how often does each vulture go without feeding at a stretch? 3) how similar are the clusters of individuals feeding together on any given day? I wonder if we have overlapping sets of individuals, e.g. even and odd days. 4) what else?

Make animations of the networks to compare them:

coords <- data.frame(name = names(igraph::V(realGraphs[[1]])),
                     x = rnorm(n = length(igraph::V(realGraphs[[1]]))),
                     y = rnorm(n = length(igraph::V(realGraphs[[1]]))))

And now more animations, but this time letting ggraph choose the layout for each one.